Improved Gini-Index Algorithm to Correct Feature-Selection Bias in Text Classification
                    
                        
                            نویسندگان
                            
                            
                        
                        
                    
                    
                    چکیده
منابع مشابه
An Improved Flower Pollination Algorithm with AdaBoost Algorithm for Feature Selection in Text Documents Classification
In recent years, production of text documents has seen an exponential growth, which is the reason why their proper classification seems necessary for better access. One of the main problems of classifying text documents is working in high-dimensional feature space. Feature Selection (FS) is one of the ways to reduce the number of text attributes. So, working with a great bulk of the feature spa...
متن کاملAn Improved Flower Pollination Algorithm with AdaBoost Algorithm for Feature Selection in Text Documents Classification
In recent years, production of text documents has seen an exponential growth, which is the reason why their proper classification seems necessary for better access. One of the main problems of classifying text documents is working in high-dimensional feature space. Feature Selection (FS) is one of the ways to reduce the number of text attributes. So, working with a great bulk of the feature spa...
متن کاملAn Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...
متن کاملAn Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...
متن کاملAn Improved Text Classification Method Based on Gini Index
In text classification, the purity of the Gini index can be used. When purity value is greater, the characteristic of the information contained in the attribute is higher, and the feature distinguishing capability is stronger. But using the Gini purity formula on feature weight, the classification result is not very good, one of the main reasons is those rare words only appearing in one categor...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: IEICE Transactions on Information and Systems
سال: 2011
ISSN: 0916-8532,1745-1361
DOI: 10.1587/transinf.e94.d.855